Parametric Kernels for Sequence Data Analysis

Authors

  • Young-In Shin
  • Donald S. Fussell
Abstract

A key challenge in applying kernel-based methods to discriminative learning is identifying a suitable kernel for a given problem domain. Many methods instead transform the input data into a set of vectors in a feature space and classify the transformed data using a generic kernel. However, finding an effective transformation scheme for sequence (e.g. time series) data is a difficult task. In this paper, we introduce a scheme for directly designing kernels for the classification of sequence data, such as that arising in handwritten character recognition and object recognition from sensor readings. Ordering information is represented by values of a parameter associated with each input data element. A similarity metric based on the parametric distance between corresponding elements is combined with their problem-specific similarity metric to produce a Mercer kernel suitable for use in methods such as support vector machines (SVMs). This scheme embeds the extraction of features from sequences of varying cardinalities directly into the kernel, without needing to transform all input data into a common feature space before classification. We apply our method to object and handwritten character recognition tasks and compare it against current approaches. The results show that a systematic approach to kernel design can obtain accuracy at least comparable to that of state-of-the-art problem-specific methods. Our contribution is the introduction of a general technique for designing SVM kernels tailored to the classification of sequence data.
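The abstract does not give the kernel's formula, so the following is only a minimal sketch of the general idea, not the authors' construction. It assumes that each element's parameter is its normalized position in the sequence, that both the parametric and the problem-specific similarities are Gaussian (RBF) functions, and that the two are multiplied and summed over all element pairs. The names `rbf`, `sequence_kernel`, `gram_matrix`, `gamma_param`, and `gamma_feat` are hypothetical.

```python
import numpy as np


def rbf(a, b, gamma):
    """Pairwise Gaussian similarity between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


def sequence_kernel(x, y, gamma_param=10.0, gamma_feat=1.0):
    """Similarity of two sequences x (n, d) and y (m, d); n and m may differ.

    Assumption: the ordering parameter of an element is its position,
    rescaled to [0, 1], so sequences of different lengths are comparable.
    """
    tx = np.linspace(0.0, 1.0, len(x))[:, None]
    ty = np.linspace(0.0, 1.0, len(y))[:, None]
    k_param = rbf(tx, ty, gamma_param)  # closeness in parameter (ordering)
    k_feat = rbf(x, y, gamma_feat)      # problem-specific element similarity
    # A product of positive semidefinite kernels, summed over all element
    # pairs, is again positive semidefinite (a Mercer kernel).
    return float((k_param * k_feat).sum())


def gram_matrix(seqs, **kwargs):
    """Symmetric matrix of kernel values for a list of sequences."""
    n = len(seqs)
    g = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            g[i, j] = g[j, i] = sequence_kernel(seqs[i], seqs[j], **kwargs)
    return g
```

A matrix built this way could be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. No common feature space is needed, because sequences of different lengths are compared directly.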

Similar resources

Integrating Gaussian Processes with Word-Sequence Kernels for Bayesian Text Categorization

We address the problem of multi-labelled text classification using word-sequence kernels. However, rather than applying them with Support Vector Machines as in previous work, we chose a classifier based on Gaussian Processes. This is a probabilistic non-parametric method that retains sound probabilistic semantics while overcoming the limitations of parametric methods. We present the empirical ...

Ensemble Kernel Learning Model for Prediction of Time Series Based on the Support Vector Regression and Meta Heuristic Search

In this paper, a method for predicting time series is presented. Time series prediction is a process that predicts future system values based on information obtained from past and present data points. Time series prediction models are widely used in various fields of engineering, economics, etc. The main purpose of using different models for time series prediction is to make the forecast with...

A Summary of Recent Progress on Efficient Parametric Approximations of Viability and Discriminating Kernels

Viability and discriminating kernels are powerful constructs for analyzing system safety through model checking, but until recently the only computational algorithms available were nonparametric grid-based approaches which, although accurate, scaled exponentially with the dimension of the system’s state space. In contrast, several polynomial complexity reachability algorithms have been developed...

Degenerate Parametric Integral Equations System for Laplace Equation and Its Effective Solving

In this paper we present an application of the degenerate kernels strategy to solving the parametric integral equations system (PIES) for the two-dimensional Laplace equation in order to improve its computing time. The main purpose of this paper is to obtain degenerate kernels for PIES based on non-degenerate kernels and to apply the collocation method to solve the modified PIES. We verify this method on two examples, ...

Comparing SVM sequence kernels: A protein subcellular localization theme

Kernel-based machine learning algorithms are versatile tools for biological sequence data analysis. Special sequence kernels can endow Support Vector Machines with biological knowledge to perform accurate classification of diverse sequence data. The kernels' relative strengths and weaknesses are difficult to evaluate on single data sets. We examine a range of recent kernels tailor-made for biolo...

Publication date: 2007